Learning Dynamic Policies from Demonstration

Authors

  • Byron Boots
  • Dieter Fox
Abstract

We address the problem of learning a policy directly from expert demonstrations. Typically, this problem is solved with a supervised learning method, such as regression or classification, that learns a reactive policy. Unfortunately, reactive policies cannot model long-range dependencies, and this omission can result in suboptimal performance. We therefore take a different approach. We observe that policies and dynamical systems are mathematical duals, and we use this fact to leverage the rich literature on system identification to learn dynamic policies with state directly from demonstration. Many system identification algorithms have desirable properties, such as the ability to model long-range dependencies, statistical consistency, and efficient off-the-shelf implementations. We show that by applying system identification algorithms to learning-from-demonstration problems, all of these properties can be carried over to the learning-from-demonstration domain. We further show that these properties can be beneficial in practice by applying state-of-the-art system identification algorithms to real-world direct learning-from-demonstration problems.
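To make the contrast between a reactive policy and a policy with state concrete, here is a minimal sketch. It is not the authors' algorithm: it substitutes ordinary least-squares ARX fitting, one of the simplest system identification methods, and the expert (a leaky integrator of observations), the function names, and all constants are hypothetical. Observations play the role of system inputs and expert actions the role of system outputs, and the fit is evaluated as one-step prediction using the expert's own past actions.

```python
import numpy as np

def simulate_expert(obs, decay=0.9, gain=0.5):
    """Hypothetical expert: the action is an internal state that
    integrates past observations (a leaky integrator)."""
    actions = np.zeros_like(obs)
    state = 0.0
    for t, o in enumerate(obs):
        state = decay * state + gain * o
        actions[t] = state
    return actions

def fit_reactive(obs, actions):
    """Reactive policy a_t = w * o_t, fitted by least squares."""
    w, *_ = np.linalg.lstsq(obs[:, None], actions, rcond=None)
    return w[0]

def fit_arx(obs, actions, order=1):
    """ARX policy fitted by least squares:
    a_t = sum_i alpha_i * a_{t-i} + sum_j beta_j * o_{t-j},
    with i = 1..order and j = 0..order."""
    rows = []
    for t in range(order, len(obs)):
        past_actions = actions[t - order:t][::-1]   # a_{t-1}, ..., a_{t-order}
        recent_obs = obs[t - order:t + 1][::-1]     # o_t, ..., o_{t-order}
        rows.append(np.concatenate([past_actions, recent_obs]))
    X = np.array(rows)
    y = actions[order:]
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return theta, X, y

# Synthetic demonstration data (assumed, for illustration only).
rng = np.random.default_rng(0)
obs = rng.standard_normal(500)
actions = simulate_expert(obs)

# Reactive fit: ignores history, so it cannot reproduce the integrator.
w = fit_reactive(obs, actions)
reactive_mse = np.mean((w * obs - actions) ** 2)

# ARX fit: past actions act as a stand-in for the policy's state.
theta, X, y = fit_arx(obs, actions, order=1)
arx_mse = np.mean((X @ theta - y) ** 2)

print(f"reactive MSE: {reactive_mse:.4f}")
print(f"ARX MSE:      {arx_mse:.2e}")
print("ARX coefficients [a_(t-1), o_t, o_(t-1)]:", np.round(theta, 3))
```

In this toy setting the reactive fit leaves a large residual because the expert's action depends on history, while the ARX fit recovers the generating coefficients. The subspace and spectral identification methods the abstract alludes to are more general than ARX, so this should be read only as an illustration of the policy-as-dynamical-system view.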

Similar resources

Learning Movement Primitives

This paper discusses a comprehensive framework for modular motor control based on a recently developed theory of dynamic movement primitives (DMP). DMPs are a formulation of movement primitives with autonomous nonlinear differential equations, whose time evolution creates smooth kinematic control policies. Model-based control theory is used to convert the outputs of these policies into motor co...

Multi-Step Learning to Search for Dynamic Environment Navigation

While navigation could be done using existing rule-based approaches, it becomes more attractive to use learning from demonstration (LfD) approaches to ease the burden of tedious rule designing and parameter tuning procedures. In our previous work, navigation in simple dynamic environments is achieved using the Learning to Search (LEARCH) algorithm with a proper feature set and the proposed data...

Learning Stable Task Sequences from Demonstration with Linear Parameter Varying Systems and Hidden Markov Models

The problem of acquiring multiple tasks from demonstration is typically divided in two sequential processes: (1) the segmentation or identification of different subgoals/subtasks and (2) a separate learning process that parameterizes a control policy for each subtask. As a result, segmentation criteria typically neglect the characteristics of control policies and rely instead on simplified mode...

Learning from Demonstration: Communication and Policy Generation

Learning from demonstration utilizes human expertise to program a robot. We believe this approach to robot programming will facilitate the development and deployment of general purpose personal robots that can adapt to specific user preferences. Demonstrations can potentially take place across a wide variety of environmental conditions. In this paper we address how learning from demonstration c...

Teacher feedback to scaffold and refine demonstrated motion primitives on a mobile robot

Task demonstration is an effective technique for developing robot motion control policies. As tasks become more complex, however, demonstration can become more difficult. In this work, we introduce an algorithm that uses corrective human feedback to build a policy able to perform a novel task, by combining simpler policies learned from demonstration. While some demonstration-based learning approach...

Publication date: 2013